-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AMDGPU] Added hot-block-rematerialize pass #126331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
✅ With the latest revision this PR passed the undef deprecator. |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
Added Jay and Carl fyi (still needs cleanup) |
|
Since this is only a draft and MirDivergence parts are going away I only I had a quick look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to #include <cmath> for this. But perhaps it would be better to use an integer-only power operation (not sure if we already have one anywhere in LLVM).
|
Is this intended to be run just before pre-RA MachineScheduler? Update: I see the pass currently requires SSA form, so it would have to be run earlier than that. Where would you put it in the pass pipeline? |
|
We've got some work in the pipeline to allow pre-RA machinescheduler to run in SSA form as well, so that might be ok. |
…t I don't see any reason why we can't just require SSA
ede9e76 to
971e556
Compare
|
Hi @adam-yang, I've done some testing with this patch and noticed that options |
Yes could you send them to me please. |
|
OK, here's a somewhat reduced test case for a crash with |
|
Here's one for |
This revealed some incorrect assumptions in this PR. Lanes are assumed to be 32-bit, but new 16-bit subregs have been added so lanes are now 16-bit. I'll have to go through the change to fix some more issues related to this. |
Note: Before this is ready for review, I intend to remove AMDGPUMirSyncDependency and AMDGPUMirDivergenceAnalysis. They were basically forks of the IR versions of the passes from before the upstream MIR versions were available.
I have the code path that uses
MachineUniformityInfoand the divergence related unit test passes with it, but I'd like to do more testing before deleting the other ones.